Transform Network Architectures for Deep Learning Based End-to-End Image/Video Coding in Subsampled Color Spaces
Authors
Abstract
Most existing deep learning based end-to-end image/video coding (DLEC) architectures are designed for the non-subsampled RGB color format. However, to achieve superior coding performance, many state-of-the-art block-based compression standards, such as High Efficiency Video Coding (HEVC/H.265) and Versatile Video Coding (VVC/H.266), are designed primarily for the YUV 4:2:0 format, where the U and V components are subsampled in consideration of the human visual system. This paper investigates various DLEC designs that support the YUV 4:2:0 format by comparing their performance against the main profiles of the HEVC and VVC standards under a common evaluation framework. Moreover, a new transform network architecture is proposed to improve the efficiency of coding YUV 4:2:0 data. The experimental results on YUV 4:2:0 datasets show that the proposed architecture significantly outperforms naive extensions of existing architectures and achieves about 10% average BD-rate improvement over intra-frame coding in HEVC.
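To make the 4:2:0 subsampling mentioned in the abstract concrete, the following is a minimal sketch in plain Python. It assumes chroma is downsampled by averaging each 2x2 block, which is only one possible choice; the abstract does not state which filter or chroma sample location is used, and the function names `subsample_420` and `sample_count_420` are hypothetical helpers introduced here for illustration.

```python
def subsample_420(plane):
    """Downsample a chroma plane by 2 in each direction.

    Assumption: simple 2x2 block averaging. Real codecs and datasets may
    use other filters or chroma sample positions.
    """
    height, width = len(plane), len(plane[0])
    if height % 2 or width % 2:
        raise ValueError("this sketch assumes even plane dimensions")
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4.0
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def sample_count_420(height, width):
    """Samples per frame: full-resolution Y plus two quarter-size chroma planes."""
    luma = height * width
    chroma = (height // 2) * (width // 2)
    return luma + 2 * chroma


# A toy 4x4 U plane (values are illustrative, not from any dataset).
u_plane = [
    [10, 20, 30, 40],
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [50, 60, 70, 80],
]
u_sub = subsample_420(u_plane)
print(u_sub)                   # 2x2 plane of block averages
print(sample_count_420(4, 4))  # compare with 3 * 4 * 4 for a non-subsampled frame
```

Under this assumption a 4:2:0 frame carries half as many samples as a non-subsampled three-channel frame of the same size, and the luma and chroma planes have different resolutions, so a network built for three equal-size channels cannot take them as a single input tensor without some adaptation.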
Similar resources
FIND: Service-Centric End-to-End Abstractions in Network Architectures
Project Summary: Next-generation network architectures will be governed by the need for flexibility. Heterogeneous end-systems, novel communication abstractions, and security and manageability challenges will require networks to provide a broad range of services that go beyond the simple store-and-forward capabilities of today's Internet. The proposed work introduces new abstractions for informa...
DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess
We present an end-to-end learning method for chess, relying on deep neural networks. Without any a priori knowledge, in particular without any knowledge regarding the rules of chess, a deep neural network is trained using a combination of unsupervised pretraining and supervised training. The unsupervised training extracts high level features from a given position, and the supervised training le...
End-to-End Deep Neural Network for Automatic Speech Recognition
We investigate the efficacy of deep neural networks on speech recognition. Specifically, we implement an end-to-end deep learning system that utilizes mel-filter bank features to directly output spoken phonemes without the need for a traditional Hidden Markov Model for decoding. The system will comprise two variants of neural networks for phoneme recognition. In particular, we utilize conv...
End-to-End Optimized Speech Coding with Deep Neural Networks
Modern compression algorithms are often the result of laborious domain-specific research; industry standards such as MP3, JPEG, and AMR-WB took years to develop and were largely hand-designed. We present a deep neural network model which optimizes all the steps of a wideband speech coding pipeline (compression, quantization, entropy coding, and decompression) end-to-end directly from raw speech...
TVM: End-to-End Optimization Stack for Deep Learning
Scalable frameworks, such as TensorFlow, MXNet, Caffe, and PyTorch drive the current popularity and utility of deep learning. However, these frameworks are optimized for a narrow range of server-class GPUs and deploying workloads to other platforms such as mobile phones, embedded devices, and specialized accelerators (e.g., FPGAs, ASICs) requires laborious manual effort. We propose TVM, an end-...
Journal
Journal title: IEEE Open Journal of Signal Processing
Year: 2021
ISSN: 2644-1322
DOI: https://doi.org/10.1109/ojsp.2021.3092257